Abstract:
PROCESSING OF AUDIO SIGNALS DURING HIGH FREQUENCY RECONSTRUCTION. The present application concerns High Frequency Reconstruction / Regeneration (HFR) of audio signals. In particular, the present application relates to a method and system for performing HFR reconstruction of audio signals with wide variations in energy level across the low frequency range that is used to reconstruct the high frequencies of the audio signal. A system configured to generate a plurality of high frequency subband signals covering a high frequency range from a plurality of low frequency subband signals is described. The system comprises a means for receiving the plurality of low frequency subband signals; a means for receiving a set of target energies, each target energy covering a different target range within the high frequency range and being indicative of the desired energy of one or more high frequency subband signals that are within the target range; a means for generating the plurality of high frequency subband signals from the plurality of (...) signals.
Publication number: BR112012024360B1
Application number: R112012024360-8
Filing date: 2011-07-14
Publication date: 2020-11-03
Inventor: Kristofer Kjoerling
Applicant: Dolby International Ab;
Main IPC class:
Description:

TECHNICAL FIELD
[0001] This application relates to High Frequency Reconstruction / Regeneration (HFR) of audio signals. In particular, the present application relates to a method and system for performing HFR reconstruction of audio signals with wide variations in energy level across the low frequency range that is used to reconstruct the high frequencies of the audio signal.
BACKGROUND OF THE INVENTION
[0002] HFR reconstruction technologies, such as Spectral Band Replication (SBR) technology, significantly improve the coding efficiency of traditional perceptual audio codecs. In combination with MPEG-4 Advanced Audio Coding (AAC), HFR reconstruction forms a very efficient audio codec, which is already in use within the XM Satellite Radio system and Digital Radio Mondiale, and is also standardized within 3GPP, the DVD Forum and others. The combination of AAC and SBR is called aacPlus. It is part of the MPEG-4 standard, where it is referred to as the High Efficiency AAC Profile (HE-AAC). In general, HFR reconstruction technology can be combined with any perceptual audio codec in a backward and forward compatible way, thus offering the possibility to upgrade established broadcast systems, such as the MPEG Layer 2 used in the Eureka DAB system. HFR reconstruction methods can also be combined with speech codecs to allow wideband speech at ultra-low bit rates.
[0003] The basic idea behind HFR reconstruction is the observation that, generally, a strong correlation between the characteristics of the high frequency range of a signal and the characteristics of the low frequency range of the same signal is present. Thus, a good approximation for the representation of the original high frequency range of a signal can be obtained by transposing the signal from the low frequency range to the high frequency range.
[0004] This transposition concept was established in Publication WO 98/57436, which is incorporated by reference, as a method for recreating a high frequency band from a lower frequency band of an audio signal. Substantial savings in bit rate can be achieved by using this concept in audio coding and / or speech coding. In the following, reference will be made to audio coding, but it should be noted that the methods and systems described are equally applicable to speech coding and to unified speech and audio coding (USAC).
[0005] High Frequency Reconstruction can be performed in the time domain or in the frequency domain, using a filter bank or transform of choice. The process usually involves several steps, of which the two main operations are, first, to create a high frequency excitation signal, and then to shape the high frequency excitation signal in order to approximate the spectral envelope of the original high frequency spectrum. The step of creating a high frequency excitation signal can, for example, be based on single sideband (SSB) modulation, in which a sinusoid with a frequency ω is mapped to a sinusoid with a frequency ω + Δω, where Δω is a fixed frequency shift. In other words, the high frequency signal can be generated from the low frequency signal by means of a "copy" operation from low frequency subbands to high frequency subbands. Another method for creating a high frequency excitation signal may involve harmonic transposition of low frequency subbands. Harmonic transposition of order T is typically designed to map a sinusoid of frequency ω of the low frequency signal to a sinusoid with a frequency Tω, where T > 1, of the high frequency signal.
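By way of a non-limiting illustration of the two excitation approaches of paragraph [0005], the following minimal sketch maps the frequencies of low band sinusoids under a copy (SSB-style) transposition and under a harmonic transposition. The function names and the numeric values are assumptions chosen for illustration only and do not appear in the source.

```python
def ssb_transpose(freqs_hz, shift_hz):
    """Copy / SSB-style transposition: each sinusoid moves by a fixed offset."""
    return [f + shift_hz for f in freqs_hz]


def harmonic_transpose(freqs_hz, order):
    """Harmonic transposition of order T: a sinusoid at f maps to T * f."""
    if order <= 1:
        raise ValueError("transposition order T must be greater than 1")
    return [order * f for f in freqs_hz]


# Hypothetical low band containing a harmonic series on a 1 kHz fundamental.
low_band = [1000.0, 2000.0, 3000.0]

shifted = ssb_transpose(low_band, shift_hz=3500.0)  # [4500.0, 5500.0, 6500.0]
scaled = harmonic_transpose(low_band, order=2)      # [2000.0, 4000.0, 6000.0]
```

In this example the fixed shift preserves the spacing between partials but moves them off the harmonic grid of the fundamental, whereas the harmonic transposition keeps every partial at an integer multiple of the fundamental.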
[0006] HFR reconstruction technology can be used as part of source coding systems, in which various control information to guide the HFR reconstruction process is transmitted from an encoder to a decoder, together with a representation of the narrowband / low frequency signal. For systems in which no additional control signals can be transmitted, the process can be applied at the decoder, with the appropriate control data estimated from the information available at the decoder.
[0007] The adjustment of the aforementioned envelope of the high frequency excitation signal aims to obtain a spectral shape that approximates the spectral shape of the original high band. For this, the spectral form of the high frequency signal has to be modified. In other words, the adjustment to be applied to the high band becomes a function of the existing spectral envelope and the desired spectral envelope.
[0008] For systems operating in the frequency domain, for example, HFR reconstruction systems implemented in a pseudo-QMF filter bank, the prior art methods are not ideal, since the creation of the high band signal, through the combination of several contributions from the original frequency range, introduces an artificial spectral envelope into the high band that is to be envelope adjusted. In other words, the high band or high frequency signal generated from the low frequency signal during the HFR reconstruction process typically exhibits an artificial spectral envelope (typically comprising spectral discontinuities). This creates difficulties for the spectral envelope adjuster, since the adjuster must not only be able to apply the desired spectral envelope with an appropriate time and frequency resolution, but must also be able to undo the spectral characteristics artificially introduced by the HFR reconstruction signal generator. This imposes difficult design constraints on the envelope adjuster. As a result, these difficulties tend to result in a noticeable loss of high frequency energy, and audible discontinuities in the spectral shape of the high band signal, in particular for speech-type signals. In other words, conventional HFR reconstruction signal generators tend to introduce discontinuities and level variations into the high band signal for signals that have large level variations over the low band range, for example, sibilants (hissing sounds). When the envelope adjuster is subsequently exposed to this high band signal, the envelope adjuster cannot, reasonably and consistently, separate the newly introduced discontinuity from any natural spectral characteristic of the low band signal.
[0009] This document describes a solution to the problem mentioned above, resulting in an increase in the perceived audio quality. In particular, this document describes a solution to the problem of generating a high band signal from a low band signal, in which the spectral envelope of the high band signal is effectively adjusted to resemble the shape of the original spectral envelope in the high band without the introduction of unwanted artifacts.
SUMMARY OF THE INVENTION
[00010] This document proposes an additional correction step as part of the generation of the high frequency reconstruction signal. As a result of the additional correction step, the audio quality of the high frequency component, that is, of the high band signal, is improved. The additional correction step can be applied to all source coding systems that use high frequency reconstruction techniques, as well as to any single-ended post-processing method or system that aims to recreate the high frequencies of an audio signal.
[00011] According to one aspect, a system configured to generate a plurality of high frequency subband signals covering a high frequency range is described. The system can be configured to generate the plurality of high frequency subband signals from a plurality of low frequency subband signals. The plurality of low frequency subband signals can be the subband signals of a low band or narrowband audio signal, which can be determined using an analysis filter bank or a transform. In particular, the plurality of low frequency subband signals can be determined from a low band signal in the time domain using a QMF (quadrature mirror filter) analysis filter bank or an FFT (Fast Fourier Transform). The plurality of high frequency subband signals generated may correspond to an approximation of the high frequency subband signals of an original audio signal from which the plurality of low frequency subband signals was derived. In particular, the plurality of low frequency subband signals and the plurality of regenerated high frequency subband signals may correspond to the subbands of a QMF filter bank and / or an FFT.
[00012] The system may comprise a means for receiving the plurality of low frequency subband signals. Therefore, the system can be placed downstream of the analysis filter bank or the transform that generates the plurality of low frequency subband signals from a low band signal. The low band signal can be an audio signal that has been decoded in a core decoder from a received bit stream. The bit stream can be stored on a storage medium, for example, on a compact disc or on a DVD, or the bit stream can be received at the decoder by a transmission medium, for example, an optical transmission medium or radio.
[00013] The system can comprise a means for receiving a set of target energies, which can also be referred to as scale factor energies. Each target energy can cover a different target range, which can also be referred to as a scale factor band, within the high frequency range. Typically, the set of target ranges, which corresponds to the set of target energies, covers the complete high frequency range. A target energy of the set of target energies is generally indicative of the desired energy of one or more high frequency subband signals that are within the corresponding target range. In particular, the target energy can correspond to the desired average energy of the one or more high frequency subband signals that are within the corresponding target range. The target energy of a target range is typically derived from the energy of the high band signal of the original audio signal within the target range. In other words, the set of target energies usually describes the spectral envelope of the high band portion of the original audio signal.
[00014] The system may comprise a means for generating the plurality of high frequency subband signals from among the plurality of low frequency subband signals. For this purpose, the means for generating the plurality of high frequency subband signals can be configured so as to perform a copy transposition of the plurality of low frequency subband signals and / or perform a harmonic transposition of the plurality of low frequency subband signals.
[00015] In addition, the means for generating the plurality of high frequency subband signals can take into account a plurality of spectral gain coefficients during the process of generating the plurality of high frequency subband signals. The plurality of spectral gain coefficients can be associated with the plurality of low frequency subband signals, respectively. In other words, each of the low frequency subband signals within the plurality of low frequency subband signals can have a corresponding spectral gain coefficient among the plurality of spectral gain coefficients. A spectral gain coefficient among the plurality of spectral gain coefficients can be applied to the corresponding low frequency subband signal.
[00016] The plurality of spectral gain coefficients can be associated with the energy of the respective plurality of low frequency subband signals. In particular, each spectral gain coefficient can be associated with the energy of its corresponding low frequency subband signal. In one embodiment, a spectral gain coefficient is determined based on the energy of the corresponding low frequency subband signal. For this purpose, a frequency dependent curve can be determined based on the plurality of energy values of the plurality of low frequency subband signals. In this case, a method for determining the plurality of gain coefficients can be based on the frequency dependent curve that is determined from a representation (for example, a logarithmic representation) of the energies of the plurality of low frequency subband signals.
[00017] In other words, the plurality of spectral gain coefficients can be derived from a frequency dependent curve fitted to the energy of the plurality of low frequency subband signals. In particular, the frequency dependent curve can be a polynomial of a predetermined order / degree. Alternatively or in addition, the frequency dependent curve may comprise different curve segments, the different curve segments being fitted to the energy of the plurality of low frequency subband signals in different frequency ranges. The different curve segments can be different polynomials of a predetermined order. In one embodiment, the different curve segments are polynomials of order zero, such that the curve segments represent the mean energy values of the plurality of low frequency subband signals within the corresponding frequency range. In another embodiment, the frequency dependent curve is fitted to the energy of the plurality of low frequency subband signals by performing a moving average filtering operation over the different frequency ranges.
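By way of illustration of the zero-order segments and the moving average mentioned in paragraph [00017], the following minimal sketch derives two coarse curves from per-subband log energies. The function names, the window size, the segment length and the example values are assumptions made for illustration and are not specified by the source.

```python
def moving_average(values_db, half_width):
    """Smooth per-subband log energies with a centered moving average."""
    out = []
    for k in range(len(values_db)):
        lo = max(0, k - half_width)
        hi = min(len(values_db), k + half_width + 1)
        out.append(sum(values_db[lo:hi]) / (hi - lo))
    return out


def segment_means(values_db, segment_len):
    """Zero-order fit: each segment is replaced by its mean value."""
    out = []
    for start in range(0, len(values_db), segment_len):
        seg = values_db[start:start + segment_len]
        out.extend([sum(seg) / len(seg)] * len(seg))
    return out


# Hypothetical log energies (in dB) of four low frequency subbands.
energies_db = [0.0, -6.0, -12.0, -30.0]

smooth = moving_average(energies_db, half_width=1)  # [-3.0, -6.0, -16.0, -21.0]
steps = segment_means(energies_db, segment_len=2)   # [-3.0, -3.0, -21.0, -21.0]
```

Both curves follow the coarse spectral slope of the low band while ignoring fine structure; a wider window or longer segments would give a coarser curve.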
[00018] In one embodiment, a gain coefficient of the plurality of gain coefficients is derived from the difference between the mean energy of the plurality of low frequency subband signals and a corresponding value of the frequency dependent curve. The corresponding value of the frequency dependent curve can be the value of the curve at a frequency within the frequency range of the low frequency subband signal to which the gain coefficient corresponds.
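The derivation of paragraph [00018] can be sketched as follows, assuming a first-order polynomial fitted by least squares to the log energies. The helper names, the choice of order 1, the use of subband indices as the frequency axis and the example energies are assumptions for illustration, not the method prescribed by the source.

```python
import math


def fit_line(xs, ys):
    """Least-squares fit of y = a + b * x (a polynomial of order 1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx > 0.0 else 0.0
    return mean_y - slope * mean_x, slope


def spectral_gains(subband_energies, eps=1e-12):
    """Return one linear amplitude gain per low frequency subband.

    The gain in dB is the mean log energy minus the fitted curve value,
    so applying the gains removes the coarse spectral slope.
    """
    energies_db = [10.0 * math.log10(e + eps) for e in subband_energies]
    indices = list(range(len(energies_db)))
    intercept, slope = fit_line(indices, energies_db)
    mean_db = sum(energies_db) / len(energies_db)
    gains_db = [mean_db - (intercept + slope * k) for k in indices]
    # Energies are squared magnitudes, so amplitude gains use a factor of 20.
    return [10.0 ** (g / 20.0) for g in gains_db]


# Hypothetical low band whose energy drops by 10 dB per subband.
energies = [1.0, 0.1, 0.01, 0.001]
gains = spectral_gains(energies)
flattened = [e * g * g for e, g in zip(energies, gains)]
```

For this example the lowest subband is attenuated and the highest is boosted, so that the gain-adjusted energies in `flattened` are approximately equal across the low band.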
[00019] Normally, the energy of the plurality of low frequency subband signals is determined on a certain time grid, for example, on a frame-by-frame basis. That is, the energy of a low frequency subband signal within a time interval defined by the time grid corresponds to the average energy of the samples of the low frequency subband signal within the time interval, for example, within a frame. Therefore, a different plurality of spectral gain coefficients can be determined on the chosen time grid, for example, a different plurality of spectral gain coefficients can be determined for each frame of the audio signal. In one embodiment, the plurality of spectral gain coefficients can be determined on a sample-by-sample basis, for example, by determining the energy of the plurality of low frequency subband signals using a sliding window across the samples of each of the low frequency subband signals. It should be noted that the system can comprise a means for determining the plurality of spectral gain coefficients from the plurality of low frequency subband signals. This means can be configured to carry out the methods mentioned above for determining the plurality of spectral gain coefficients.
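The two time grids of paragraph [00019] can be illustrated with the following minimal sketch, which computes the energy of one subband signal per frame and per sample. The function names, the frame length, the window length and the example samples are assumptions for illustration only.

```python
def frame_energies(samples, frame_len):
    """Average energy of the subband samples in each complete frame."""
    n_frames = len(samples) // frame_len
    return [
        sum(abs(x) ** 2 for x in samples[i * frame_len:(i + 1) * frame_len])
        / frame_len
        for i in range(n_frames)
    ]


def sliding_energies(samples, window_len):
    """Per-sample energy estimate over a window ending at each sample."""
    out = []
    for n in range(len(samples)):
        window = samples[max(0, n - window_len + 1):n + 1]
        out.append(sum(abs(x) ** 2 for x in window) / len(window))
    return out


# Hypothetical complex-valued samples of one low frequency subband signal.
subband = [1 + 0j, 0 + 1j, 2 + 0j, 0 - 2j]

per_frame = frame_energies(subband, frame_len=2)       # [1.0, 4.0]
per_sample = sliding_energies(subband, window_len=2)   # [1.0, 1.0, 2.5, 4.0]
```

The frame-based grid yields one energy value, and hence one set of gain coefficients, per frame, while the sliding window yields a value for every sample at a higher computational cost.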
[00020] The means for generating the plurality of high frequency subband signals can be configured to amplify the plurality of low frequency subband signals using the respective plurality of spectral gain coefficients. Although reference is made to an "amplification step" or "amplification" below, the "amplification" operation can be replaced by other operations, such as a "multiplication" operation, a "rescaling" operation or an "adjustment" operation. Amplification can be done by multiplying a sample of a low frequency subband signal by its corresponding spectral gain coefficient. In particular, the means for generating the plurality of high frequency subband signals can be configured to determine a sample of a high frequency subband signal at a given time point from samples of a low frequency subband signal at the given time point and at at least one preceding time point. In addition, the samples of the low frequency subband signal can be amplified by the respective spectral gain coefficient of the plurality of spectral gain coefficients. In one embodiment, the means for generating the plurality of high frequency subband signals is configured to generate the plurality of high frequency subband signals from the plurality of low frequency subband signals according to the "copy" algorithm specified for SBR in the MPEG-4 standard. The plurality of low frequency subband signals used in this "copy" algorithm can be amplified using the plurality of spectral gain coefficients, the "amplification" operation being performed as described above.
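The amplification and copy operations of paragraph [00020] can be sketched as below. This is a simplified illustration and not the patching algorithm of the MPEG-4 standard: the cyclic reuse of source subbands, the function name and all parameters are assumptions.

```python
def copy_with_gains(low_subbands, gains, n_high, start_band=0):
    """Generate n_high high frequency subbands from low frequency subbands.

    low_subbands: list of subbands, each a list of (possibly complex) samples.
    gains: one spectral gain coefficient per low frequency subband.
    Source subbands from start_band upward are reused cyclically, which
    loosely mimics the use of several patches (an assumption of this sketch).
    """
    if len(gains) != len(low_subbands):
        raise ValueError("need exactly one gain per low frequency subband")
    n_source = len(low_subbands) - start_band
    high = []
    for k in range(n_high):
        src = start_band + (k % n_source)
        # Amplification: multiply every sample by the subband's gain.
        high.append([gains[src] * x for x in low_subbands[src]])
    return high


# Hypothetical low band with a falling level and gains that compensate it.
low = [[1.0, 1.0], [0.5, 0.5], [0.25, 0.25]]
gains = [0.5, 1.0, 2.0]
high = copy_with_gains(low, gains, n_high=4, start_band=1)
```

With these example gains every generated high frequency subband has the same level, so no level step appears where the second patch begins.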
[00021] The system may include a means for adjusting the energy of the plurality of high frequency subband signals using the set of target energies. This operation is generally referred to as spectral envelope adjustment. The spectral envelope adjustment can be done by adjusting the energy of the plurality of high frequency subband signals in such a way that the average energy of the plurality of high frequency subband signals that are within a target range corresponds to the corresponding target energy. This can be achieved by determining an envelope adjustment value from the energy values of the plurality of high frequency subband signals that are within a target range and the corresponding target energy. In particular, the envelope adjustment value can be determined from a ratio between the target energy and the energy values of the plurality of high frequency subband signals that are within the corresponding target range. This envelope adjustment value can be used to adjust the energy of the plurality of high frequency subband signals.
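As a minimal sketch of the envelope adjustment value of paragraph [00021], the following example computes a single amplitude gain for one target range from the ratio of the target energy to the mean subband energy. The function name, the small constant `eps` guarding against division by zero, and the example values are assumptions.

```python
import math


def envelope_gain(high_energies, target_energy, eps=1e-12):
    """Amplitude gain that moves the mean energy of a target range to its target."""
    mean_energy = sum(high_energies) / len(high_energies)
    return math.sqrt(target_energy / (mean_energy + eps))


# Hypothetical energies of three high frequency subbands in one target range.
energies_in_range = [4.0, 2.0, 6.0]  # mean energy 4.0

gain = envelope_gain(energies_in_range, target_energy=1.0)  # about 0.5
adjusted = [e * gain ** 2 for e in energies_in_range]
```

Because the same gain is applied to all subbands of the target range, the mean energy reaches the target while the relative levels between the subbands are preserved.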
[00022] In one embodiment, the means for adjusting the energy comprises a means for limiting the energy adjustment of the high frequency subband signals that are within a limiter range. Typically, the limiter range covers more than one target range. The means for limiting is generally used to prevent undesirable amplification of noise within certain high frequency subband signals. For example, the means for limiting can be configured to determine an average envelope adjustment value from the envelope adjustment values corresponding to the target ranges covered by or within the limiter range. In addition, the means for limiting can be configured to limit the energy adjustment of the high frequency subband signals that are within the limiter range to a value that is proportional to the average envelope adjustment value.
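The limiting of paragraph [00022] can be illustrated as follows, assuming one envelope adjustment value per target range and a caller-supplied proportionality factor. The function name, the band layout and the factor of 1.5 are hypothetical choices for this sketch.

```python
def limit_gains(gains, limiter_bands, max_ratio):
    """Cap envelope adjustment values relative to the mean of their limiter band.

    gains: one envelope adjustment value per target range.
    limiter_bands: list of (first, last) inclusive target range indices.
    max_ratio: proportionality factor applied to the mean value (assumed).
    """
    limited = list(gains)
    for first, last in limiter_bands:
        band = gains[first:last + 1]
        ceiling = max_ratio * sum(band) / len(band)
        for i in range(first, last + 1):
            limited[i] = min(gains[i], ceiling)
    return limited


# Hypothetical adjustment values; the third target range asks for a large boost.
gains = [1.0, 1.0, 7.0, 2.0, 2.0]
limited = limit_gains(gains, [(0, 2), (3, 4)], max_ratio=1.5)
```

In the first limiter band the mean value is 3.0, so the value 7.0 is capped at 4.5, while the second limiter band is left unchanged because none of its values exceeds its ceiling of 3.0.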
[00023] Alternatively or additionally, the means for adjusting the energy of the plurality of high frequency subband signals may comprise a means for ensuring that the adjusted high frequency subband signals that are within a particular target range have the same energy. This means is often referred to as an "interpolation" means. In other words, the "interpolation" means ensures that the energy of each of the high frequency subband signals within the particular target range corresponds to the target energy. The "interpolation" means can be implemented by adjusting each high frequency subband signal within the particular target range separately, so that the energy of the adjusted high frequency subband signal corresponds to the target energy associated with the particular target range. This can be achieved by determining a different envelope adjustment value for each high frequency subband signal within the particular target range. A different envelope adjustment value can be determined based on the energy of the particular high frequency subband signal and the target energy that corresponds to the particular target range. In one embodiment, an envelope adjustment value for a particular high frequency subband signal is determined based on the ratio between the target energy and the energy of the particular high frequency subband signal.
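The "interpolation" means of paragraph [00023] can be sketched by computing one envelope adjustment value per subband instead of one per target range. The function name, the constant `eps` and the example energies are assumptions for illustration.

```python
import math


def per_subband_gains(high_energies, target_energy, eps=1e-12):
    """One amplitude gain per subband, each mapping that subband to the target."""
    return [math.sqrt(target_energy / (e + eps)) for e in high_energies]


# Hypothetical energies of three high frequency subbands in one target range.
energies_in_range = [4.0, 1.0, 0.25]

gains = per_subband_gains(energies_in_range, target_energy=1.0)  # about [0.5, 1.0, 2.0]
adjusted = [e * g * g for e, g in zip(energies_in_range, gains)]
```

After adjustment every subband in the target range has approximately the target energy, so the level differences between the subbands within the range are removed.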
[00024] The system can also comprise a means for receiving control data. The control data can be indicative of whether to apply the plurality of spectral gain coefficients in order to generate the plurality of high frequency subband signals. In other words, the control data can be indicative of whether the additional gain adjustment of the low frequency subband signals should be performed or not. Alternatively or in addition, the control data can be indicative of a method that should be used to determine the plurality of spectral gain coefficients. For example, the control data can be indicative of the predetermined order of the polynomial that should be used to determine the frequency dependent curve fitted to the energies of the plurality of low frequency subband signals. The control data is usually received from a corresponding encoder, which analyzes the original audio signal and informs the corresponding decoder or HFR reconstruction system on how to decode the bit stream.
[00025] According to another aspect, an audio decoder configured to decode a bit stream comprising a low frequency audio signal and comprising a set of target energies describing the spectral envelope of a high frequency audio signal is described. In other words, an audio decoder configured to decode a bit stream representative of a low frequency audio signal and representative of a set of target energies describing the spectral envelope of a high frequency audio signal is described. The audio decoder may comprise a core decoding and / or transform unit configured to determine a plurality of low frequency subband signals associated with the low frequency audio signal from the bit stream. Alternatively or in addition, the audio decoder may comprise a high frequency generation unit according to the system described in this document, the system being configured to determine a plurality of high frequency subband signals from the plurality of low frequency subband signals and the set of target energies. Alternatively or in addition, the decoder may comprise a merging and / or inverse transform unit configured to generate an audio signal from the plurality of low frequency subband signals and the plurality of high frequency subband signals. The merging and inverse transform unit may comprise a synthesis filter bank or a transform, for example, an inverse QMF filter bank or an inverse FFT.
[00026] According to another aspect, an encoder configured to generate control data from an audio signal is described. The audio encoder may comprise a means for analyzing the spectral shape of the audio signal and for determining a degree of spectral envelope discontinuities introduced when regenerating a high frequency component of the audio signal from a low frequency component of the audio signal. Accordingly, the encoder may comprise certain elements of a corresponding decoder. In particular, the encoder may comprise an HFR reconstruction system, as described herein. This would allow the encoder to determine the degree of discontinuity in the spectral envelope that could be introduced into the high frequency component of the audio signal at the decoder. Alternatively or in addition, the encoder may comprise a means for generating control data to control the regeneration of the high frequency component based on the degree of discontinuities. In particular, the control data can correspond to the control data received by the corresponding decoder or the HFR reconstruction system. The control data can be indicative of whether to use the plurality of spectral gain coefficients during the HFR reconstruction process and / or which predetermined polynomial order to use in order to determine the plurality of spectral gain coefficients. In order to determine this information, a ratio between selected parts of the low frequency range, that is, the frequency range covered by the plurality of low frequency subband signals, can be determined. This ratio information can be determined, for example, by studying the lowest frequencies of the low band and the highest frequencies of the low band in order to assess the spectral variation of the low band signal that will subsequently be used in the decoder for high frequency reconstruction. A high ratio could indicate a greater degree of discontinuity. Control data can also be determined using signal type detectors. For example, the detection of speech signals may indicate a greater degree of discontinuity. On the other hand, the detection of prominent sinusoids in the original audio signal can lead to control data that indicates that the plurality of spectral gain coefficients should not be used during the HFR reconstruction process.
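One possible encoder-side decision along the lines of paragraph [00026] is sketched below: the mean energy of the lowest subbands of the low band is compared with that of the highest subbands, and the gain adjustment is disabled when a tonal signal is detected. The function names, the number of edge subbands, the threshold and the `tonal` flag are hypothetical; the source does not specify a decision rule.

```python
def low_band_energy_ratio(subband_energies, n_edge, eps=1e-12):
    """Ratio of the mean energy of the lowest n_edge subbands to the highest n_edge."""
    if not 0 < n_edge <= len(subband_energies):
        raise ValueError("n_edge must be between 1 and the number of subbands")
    lowest = subband_energies[:n_edge]
    highest = subband_energies[-n_edge:]
    return (sum(lowest) / n_edge + eps) / (sum(highest) / n_edge + eps)


def use_gain_adjustment(subband_energies, n_edge, threshold, tonal=False):
    """Hypothetical rule for a one-bit control flag sent to the decoder."""
    if tonal:
        # Prominent sinusoids detected: do not apply the spectral gains.
        return False
    return low_band_energy_ratio(subband_energies, n_edge) > threshold


# Hypothetical low band energies: one steeply sloped frame, one flat frame.
sloped = [8.0, 4.0, 1.0, 0.5]
flat = [1.0, 1.0, 1.0, 1.0]

flag_sloped = use_gain_adjustment(sloped, n_edge=2, threshold=4.0)
flag_flat = use_gain_adjustment(flat, n_edge=2, threshold=4.0)
flag_tonal = use_gain_adjustment(sloped, n_edge=2, threshold=4.0, tonal=True)
```

In this example only the steeply sloped, non-tonal frame enables the additional gain adjustment; a real encoder could equally base the decision on running its own HFR reconstruction, as the paragraph above notes.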
[00027] According to another aspect, a method for generating a plurality of high frequency subband signals covering a high frequency range from a plurality of low frequency subband signals is described. The method can comprise the steps of receiving the plurality of low frequency subband signals and / or receiving a set of target energies. Each target energy can cover a different target range within the high frequency range. In addition, each target energy can be indicative of the desired energy of one or more high frequency subband signals that are within the target range. The method may comprise the step of generating the plurality of high frequency subband signals from the plurality of low frequency subband signals and from a plurality of spectral gain coefficients associated with the plurality of low frequency subband signals, respectively. Alternatively or in addition, the method may comprise the step of adjusting the energy of the plurality of high frequency subband signals using the set of target energies. The step of adjusting the energy can comprise the step of limiting the energy adjustment of the high frequency subband signals that are within a limiter range. Typically, the limiter range covers more than one target range.
[00028] According to an additional aspect, a method for decoding a bit stream representative of or comprising a low frequency audio signal and a set of target energies describing the spectral envelope of a corresponding high frequency audio signal is described. Typically, the low frequency and high frequency audio signals correspond to a low frequency and a high frequency component of the same original audio signal. The method may comprise the step of determining a plurality of low frequency subband signals associated with the low frequency audio signal from the bit stream. Alternatively or in addition, the method may comprise the step of determining a plurality of high frequency subband signals from the plurality of low frequency subband signals and the set of target energies. This step is generally performed according to the HFR reconstruction methods described in this document. Then, the method can include the step of generating an audio signal from the plurality of low frequency subband signals and the plurality of high frequency subband signals.
[00029] According to another aspect, a method for generating control data from an audio signal is described. The method can comprise the step of analyzing the spectral shape of the audio signal in order to determine a degree of discontinuities introduced when regenerating a high frequency component of the audio signal from a low frequency component of the audio signal. In addition, the method can comprise the step of generating control data to control the regeneration of the high frequency component based on the degree of discontinuities.
[00030] According to an additional aspect, a software program is described. The software program can be adapted to run on a processor and to perform the method steps described in this document when performed on a computing device.
[00031] According to another aspect, a storage medium is described. The storage medium may include a software program adapted to run on a processor and to perform the method steps described in this document when performed on a computing device.
[00032] According to another aspect, a computer program product is described. The computer program may include executable instructions for performing the method steps described in this document when performed on a computer.
[00033] It should be noted that the methods and systems, including their preferred embodiments, as described in the present patent application can be used independently or in combination with other methods and systems described in this document. In addition, all aspects of the methods and systems described in the present patent application can be arbitrarily combined. In particular, the characteristics of the embodiments can be combined with each other in an arbitrary manner.
BRIEF DESCRIPTION OF THE DRAWINGS
[00034] The present invention is explained below by means of illustrative examples with reference to the accompanying drawings, in which:
[00035] Figure 1a shows the absolute spectrum of an exemplary high band signal before spectral envelope adjustment;
Figure 1b illustrates an exemplary relationship between time frames of audio data and the envelope time borders of the spectral envelopes;
Figure 1c illustrates the absolute spectrum of an exemplary high band signal before spectral envelope adjustment, and the corresponding scale factor bands, limiter bands, and HF (high frequency) patches;
Figure 2 illustrates an embodiment of an HFR reconstruction system in which the copy process is complemented with an additional gain adjustment step;
Figure 3 shows an approximation of the coarse spectral envelope of an exemplary low band signal;
Figure 4 illustrates an embodiment of an additional gain adjuster that operates on optional control data and the QMF subband samples, and outputs a gain curve;
Figure 5 illustrates a more detailed embodiment of the additional gain adjuster of figure 4;
Figure 6 illustrates an embodiment of an HFR reconstruction system with a narrowband signal as an input and a wideband signal as an output;
Figure 7 illustrates an embodiment of an HFR reconstruction system incorporated in the SBR module of an audio decoder;
Figure 8 illustrates an embodiment of the high frequency reconstruction module of an exemplary audio decoder;
Figure 9 illustrates an exemplary embodiment of an encoder;
Figure 10a illustrates the spectrogram of an exemplary vocal segment that was decoded using a conventional decoder;
Figure 10b illustrates the spectrogram of the vocal segment of figure 10a that was decoded using a decoder that applies the additional gain adjustment processing; and
Figure 10c illustrates the spectrogram of the vocal segment of figure 10a for the original uncoded signal.
DESCRIPTION OF THE PREFERRED EMBODIMENTS
[00036] The embodiments described below are merely illustrative of the principles of the present invention "PROCESSING AUDIO SIGNALS DURING HIGH FREQUENCY RECONSTRUCTION". It should be understood that modifications and variations of the arrangements and details described in this document will be evident to others skilled in the art. It is therefore intended to be limited only by the scope of the embodiments and not by the specific details presented by way of description and explanation of the embodiments of the present invention.
[00037] As described above, audio decoders using HFR techniques typically comprise an HFR unit for generating a high frequency audio signal and a subsequent spectral envelope adjustment unit for adjusting the spectral envelope of the high frequency audio signal. The spectral envelope of the audio signal is typically adjusted by means of a filter bank implementation or by means of time domain filtering. The adjustment can either aim to correct the absolute spectral envelope, or it can be performed by filtering, which also corrects the phase characteristics. Either way, the adjustment is typically a combination of two steps: the removal of the current spectral envelope and the application of the target spectral envelope.
[00038] It is important to note that the methods and systems described in this document are not merely directed at removing the spectral envelope of the audio signal. The methods and systems aim to perform a suitable spectral correction of the spectral envelope of the low band signal as part of the high frequency regeneration step, so as not to introduce spectral envelope discontinuities in the high frequency spectrum. Such discontinuities are created by combining different segments of the low band, i.e. of the low frequency signal, that are shifted or transposed to different frequency ranges of the high band, i.e. of the high frequency signal.
[00039] Figure 1a shows a stylistically drawn spectrum 100, 110 of the output of an HFR unit, prior to going into the envelope adjuster. In the upper panel, a copy method (with two patches) is used to generate the high band signal 105 from the low band signal 101, e.g. the copy method used in MPEG-4 Spectral Band Replication, which is outlined in the document "ISO/IEC 14496-3 Information Technology - Coding of audio-visual objects - Part 3", which is incorporated by reference in this document. The copy method translates parts of the lower frequencies 101 to the higher frequencies 105. In the lower panel, a harmonic transposition method (with two patches) is used to generate the high band signal 115 from the low band signal 111, e.g. the harmonic transposition method of MPEG-D USAC, which is described in the document "MPEG-D USAC: ISO/IEC 23003-3 - Unified Speech and Audio Coding" and which is incorporated by reference in this document.
[00040] In the subsequent envelope adjustment stage, a target spectral envelope is applied to the high frequency components 105, 115. As can be seen from the spectrum 105, 115 going into the envelope adjuster, discontinuities (notably at the patch borders) can be observed in the spectral shape of the high band excitation signal 105, 115, i.e. of the high band signal entering the envelope adjuster. These discontinuities stem from the fact that several contributions from the low frequencies 101, 111 are used to generate the high band 105, 115. As can be seen, the spectral shape of the high band signal 105, 115 is related to the spectral shape of the low band signal 101, 111. Consequently, particular spectral shapes of the low band signal 101, 111, e.g. the sloping shape illustrated in figure 1a, may lead to discontinuities in the overall spectrum 100, 110.
[00041] In addition to the spectrum 100, 110, figure 1a shows exemplary frequency bands 130 of the spectral envelope data that represent the target spectral envelope. These frequency bands 130 are referred to as scale factor bands or target ranges. Typically, a target energy value, i.e. a scale factor energy, is specified for each target range, i.e. for each scale factor band. In other words, the scale factor bands define the effective frequency resolution of the target spectral envelope, since there is typically only a single target energy value per target range. Using the scale factors or target energies specified for the scale factor bands, the subsequent envelope adjuster adjusts the high band signal so that the energy of the high band signal within each scale factor band equals the energy of the received spectral envelope data, i.e. the target energy, for the respective scale factor band.
[00042] Figure 1c provides a more detailed description by means of an exemplary audio signal. The graph shows the spectrum of a real-world audio signal 121 going into the envelope adjuster, as well as the corresponding original signal 120. In this particular example, the SBR range, i.e. the high frequency range, starts at 6.4 kHz and consists of three different replications of the low band frequency range. The frequency ranges of the different replications are indicated by "patch 1", "patch 2", and "patch 3". It is evident from the graph that the patching introduces discontinuities in the spectral envelope at around 6.4 kHz, 7.4 kHz, and 10.8 kHz. In the present example, these frequencies correspond to the patch borders.
[00043] Figure 1c further illustrates the scale factor bands 130, as well as the limiter bands 135, whose function is described in more detail below. In the illustrated embodiment, the envelope adjuster of MPEG-4 SBR is used. This envelope adjuster operates on a QMF filter bank. The main aspects of the operation of such an envelope adjuster are:
- calculating the average energy across a scale factor band 130 of the input signal to the envelope adjuster, i.e. of the signal coming out of the HFR unit; in other words, the average energy of the regenerated high band signal is calculated within each scale factor band / target range 130;
- determining a gain value, also referred to as an envelope adjustment value, for each scale factor band 130, the envelope adjustment value being the square root of the energy ratio between the target energy (i.e. the target energy received from an encoder) and the average energy of the regenerated high band signal 121 within the respective scale factor band 130;
- applying the respective envelope adjustment value to the frequency band of the regenerated high band signal 121, the frequency band corresponding to the respective scale factor band 130.
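The three steps listed above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the MPEG-4 SBR implementation: the function names, the `band_edges` representation (pairs of subband indices, stop exclusive) and the small `eps` guard against division by zero are all hypothetical.

```python
import math

def envelope_gains(subband_energies, band_edges, target_energies, eps=1e-12):
    """Return one amplitude gain per scale factor band.

    subband_energies: energy of each regenerated high band QMF subband
    band_edges: hypothetical list of (start, stop) subband indices, stop exclusive
    target_energies: one target energy per scale factor band (from the encoder)
    eps: assumed guard value to avoid dividing by zero
    """
    gains = []
    for (start, stop), target in zip(band_edges, target_energies):
        band = subband_energies[start:stop]
        avg = sum(band) / len(band)                      # step 1: average energy
        gains.append(math.sqrt(target / max(avg, eps)))  # step 2: sqrt of energy ratio
    return gains

def apply_gains(samples, band_edges, gains):
    """Step 3: scale every subband sample by the gain of its scale factor band."""
    out = list(samples)
    for (start, stop), g in zip(band_edges, gains):
        for k in range(start, stop):
            out[k] = samples[k] * g
    return out
```

Because the gain is applied to amplitudes while the targets are energies, the square root is what makes the adjusted band energy match the target.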
[00044] In addition, the envelope adjuster can comprise further steps and variations, in particular:
- a limiter functionality, which limits the maximum envelope adjustment value that may be applied over a given frequency band, i.e. over a limiter band 135. The maximum allowed envelope adjustment value is a function of the envelope adjustment values determined for the different scale factor bands 130 that fall within a limiter band 135. In particular, the maximum allowed envelope adjustment value is a function of the mean of the envelope adjustment values determined for the different scale factor bands 130 that fall within a limiter band 135. As an example, the maximum allowed envelope adjustment value can be the mean of the relevant envelope adjustment values multiplied by a limiter factor (such as 1.5). The limiter functionality is typically applied in order to limit the introduction of noise into the regenerated high band signal 121. This is particularly relevant for audio signals that contain prominent sinusoids, i.e. audio signals whose spectrum has distinct peaks at certain frequencies. Without the limiter functionality, large envelope adjustment values would be determined for the scale factor bands 130 in which the original audio signal contains such distinct peaks. As a result, the spectrum of the complete scale factor band 130 (and not just the distinct peak) would be adjusted, thereby introducing noise.
- an interpolation functionality, which allows envelope adjustment values to be calculated for each individual QMF subband within a scale factor band, instead of calculating a single envelope adjustment value for the complete scale factor band. Since the scale factor bands typically comprise more than one QMF subband, an envelope adjustment value can be calculated as the ratio between the energy of a particular QMF subband within the scale factor band and the target energy received from the encoder, instead of calculating the ratio between the average energy of all QMF subbands within the scale factor band and the target energy received from the encoder. Therefore, a different envelope adjustment value can be determined for each QMF subband within a scale factor band. It should be noted that the target energy value received for a scale factor band typically corresponds to the average energy of that frequency range within the original signal. It is up to the decoder how to apply the received average target energy to the corresponding frequency band of the regenerated high band signal. This can be done by applying one overall envelope adjustment value to the QMF subbands within a scale factor band of the regenerated high band signal, or by applying an individual envelope adjustment value to each QMF subband. The latter approach can be thought of as if the received envelope information (i.e. one target energy per scale factor band) were "interpolated" across the QMF subbands within a scale factor band in order to provide a higher frequency resolution. This approach is therefore referred to as "interpolation" in MPEG-4 SBR.
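The limiter and interpolation variants described above can be sketched as follows. This is a hedged illustration only: the function names are hypothetical, the cap follows the "mean times a limiter factor such as 1.5" example given in the text, and the `eps` guard is an added assumption.

```python
import math

def limit_gains(gains, limiter_factor=1.5):
    """Cap the gains of the scale factor bands that fall inside one limiter band.

    The cap is the mean gain multiplied by limiter_factor, as in the example
    from the text; real decoders may derive the cap differently.
    """
    cap = limiter_factor * sum(gains) / len(gains)
    return [min(g, cap) for g in gains]

def interpolated_gains(subband_energies, target_energy, eps=1e-12):
    """Return one gain per QMF subband of a single scale factor band.

    Each subband is matched to the band's target energy individually,
    instead of using the average energy of the whole band.
    """
    return [math.sqrt(target_energy / max(e, eps)) for e in subband_energies]
```

With gains of 1, 1 and 4 inside one limiter band, the mean is 2 and the cap is 3, so only the outlier is reduced. This is how the limiter prevents a single deep spectral dip from being boosted far above its neighbours.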
[00045] Returning to figure 1c, it can be seen that the envelope adjuster would have to apply large envelope adjustment values in order to match the spectrum 121 of the signal going into the envelope adjuster to the spectrum 120 of the original signal. It can also be seen that, due to the discontinuities, large variations of the envelope adjustment values occur within the limiter bands 135. As a result of such large variations, the envelope adjustment values that correspond to the local minima of the regenerated spectrum 121 will be limited by the limiter functionality of the envelope adjuster. Consequently, the discontinuities within the regenerated spectrum 121 will remain, even after the envelope adjustment operation has been performed. On the other hand, if no limiter functionality is used, undesirable noise may be introduced, as described above.
[00046] Thus, a problem for the regeneration of a high band signal occurs for any signal that has large level variations across the low band range. This problem is due to the discontinuities introduced during the high frequency regeneration of the high band. When the envelope adjuster is subsequently exposed to this regenerated signal, it cannot reasonably and consistently separate the newly introduced discontinuities from any real-world spectral characteristics of the low band signal. This problem has two effects. First, spectral shapes are introduced into the high band signal that the envelope adjuster cannot compensate for. The output therefore has the wrong spectral shape. Second, an instability effect is perceived, because this effect comes and goes as a function of the low band spectral characteristics.
[00047] This document addresses the aforementioned problem by describing a method and system that provide, at the input of the envelope adjuster, an HFR high band signal that does not exhibit spectral discontinuities. For this purpose, it is proposed to remove or reduce the spectral envelope of the low band signal when performing the high frequency regeneration. In this way, the introduction of spectral discontinuities into the high band signal prior to envelope adjustment can be avoided. As a result, the envelope adjuster does not have to handle such spectral discontinuities. In particular, a conventional envelope adjuster can be used, with the limiter functionality of the envelope adjuster being used to avoid introducing noise into the regenerated high band signal. In other words, the described method and system can be used to regenerate an HFR high band signal with little or no spectral discontinuity and a low noise level.
[00048] It should be noted that the time resolution of the envelope adjuster may differ from the time resolution of the proposed processing of the spectral envelope during high band signal generation. As indicated above, the processing of the spectral envelope during the regeneration of the high band signal is designed to modify the spectral envelope of the low band signal, in order to ease the processing within the subsequent envelope adjuster. This processing, i.e. the modification of the spectral envelope of the low band signal, can for example be carried out once per audio frame, while the envelope adjuster can adjust the spectral envelope over several time intervals, i.e. use several received spectral envelopes. This is outlined in figure 1b, in which the time grid 150 of the spectral envelope data is shown in the upper panel, and the time grid 155 of the processing of the spectral envelope of the low band signal during high band signal regeneration is shown in the lower panel. As can be seen in the example of figure 1b, the time borders of the spectral envelope data vary over time, whereas the processing of the spectral envelope of the low band signal operates on a fixed time grid. It can also be seen that several envelope adjustment cycles (represented by the time borders 150) can be performed during one cycle of processing of the spectral envelope of the low band signal. In the illustrated example, the processing of the spectral envelope of the low band signal operates on a frame-by-frame basis, which means that a different plurality of spectral gain coefficients is determined for each frame of the signal. It should be noted that the processing of the low band signal can operate on any time grid, and that the time grid of such processing does not need to coincide with the time grid of the spectral envelope data.
[00049] Figure 2 shows a filter bank based HFR system 200. The HFR system 200 operates using a pseudo-QMF filter bank, and the system 200 can be used to produce the high band and low band signal 100 illustrated in the upper panel of figure 1a. However, an additional gain adjustment step has been added as part of the high frequency generation process, which, in the illustrated example, is a copy process. The low frequency input signal is analyzed by a 32-subband QMF analysis filter bank 201 in order to generate a plurality of low frequency subband signals. Some or all of the low frequency subband signals are patched to higher frequency locations according to an HF (high frequency) generation algorithm. In addition, the plurality of low frequency subband signals is fed directly to the synthesis filter bank 202. The synthesis filter bank 202 is a 64-subband inverse QMF filter bank. For the particular implementation illustrated in figure 2, using a 32-subband QMF analysis filter bank 201 and a 64-subband QMF synthesis filter bank 202 yields an output sampling rate of twice the sampling rate of the input signal. It should be noted, however, that the systems described in this document are not limited to systems with different input and output sampling rates. A multitude of different sampling rate relations can be envisaged by those skilled in the art.
[00050] As illustrated in figure 2, the lower frequency subbands are mapped to higher frequency subbands. The gain adjustment stage 204 is introduced as part of this copy process. The created high frequency signal, i.e. the generated plurality of high frequency subband signals, is passed to the envelope adjuster 203 (possibly comprising a limiter and/or interpolation functionality) before being combined with the plurality of low frequency subband signals in the synthesis filter bank 202. By using such an HFR system 200, and in particular by using a gain adjustment stage 204, the introduction of spectral envelope discontinuities, as illustrated in figure 1, can be avoided. For this purpose, the gain adjustment stage 204 modifies the spectral envelope of the low band signal, i.e. the spectral envelope of the plurality of low frequency subband signals, such that the modified low band signal can be used to generate a high band signal, i.e. a plurality of high frequency subband signals, which does not exhibit discontinuities, notably discontinuities at the patch borders. With reference to figure 1a, the additional gain adjustment stage 204 ensures that the spectral envelope 101, 111 of the low band signal is modified such that there are no, or only limited, discontinuities in the generated high band signal 105, 115.
[00051] The modification of the spectral envelope of the low band signal can be achieved by applying a gain curve to the spectral envelope of the low band signal. Such a gain curve can be determined by a gain curve determination unit 400 shown in figure 4. The module 400 takes as input the QMF data 402 corresponding to the frequency range of the low band signal used to recreate the high band. In other words, the plurality of low frequency subband signals is passed to the gain curve determination unit 400. As already indicated, only a subset of the available QMF subbands of the low band signal may be used to generate the high band signal, i.e. only a subset of the available QMF subbands may be passed to the gain curve determination unit 400. In addition, the module 400 can receive optional control data 404, e.g. control data sent from a corresponding encoder. The module 400 outputs a gain curve 403 that is to be applied during the high frequency regeneration process. In one embodiment, the gain curve 403 is applied to the QMF subbands of the low band signal that are used to generate the high band signal. In other words, the gain curve 403 can be used within the copy process of the HFR process.
[00052] The optional control data 404 may comprise information on the resolution of the coarse spectral envelope that is to be estimated in the module 400, and/or information on the suitability of applying the gain adjustment process. Thus, the control data 404 can control the amount of additional processing involved in the gain adjustment process. The control data 404 can also trigger a bypass of the additional gain adjustment processing if signals occur that do not lend themselves well to coarse spectral envelope estimation, e.g. signals comprising single sinusoids.
[00053] Figure 5 shows a more detailed view of the module 400 of figure 4. The QMF data 402 of the low band signal is passed to an envelope estimation unit 501 that estimates the spectral envelope, e.g. on a logarithmic energy scale. The spectral envelope is then passed to a module 502 that estimates the coarse spectral envelope from the high (frequency) resolution spectral envelope received from the envelope estimation unit 501. In one embodiment, this is done by fitting a low-order polynomial to the spectral envelope data, i.e. a polynomial of an order in the range of, for example, 1, 2, 3, or 4. The coarse spectral envelope can also be determined by performing a moving average operation on the high resolution spectral envelope along the frequency axis. The determination of a coarse spectral envelope 301 of a low band signal is shown in figure 3. It can be seen that the absolute spectrum 302 of the low band signal, i.e. the energy of the QMF bands 302, is approximated by a coarse spectral envelope 301, i.e. by a frequency dependent curve fitted to the spectral envelope of the plurality of low frequency subband signals. Furthermore, it is shown that only 20 QMF subband signals are used for generating the high band signal, i.e. only a part of the 32 QMF subband signals is used in the HFR process.
[00054] The method used for determining the coarse spectral envelope from the high resolution spectral envelope and, in particular, the order of the polynomial that is fitted to the high resolution spectral envelope can be controlled by the optional control data 404. The order of the polynomial can be a function of the size of the frequency range 302 of the low band signal for which a coarse spectral envelope 301 is to be determined, and/or a function of other parameters relevant to the overall coarse spectral shape of the relevant frequency range 302 of the low band signal. The polynomial fit calculates a polynomial that approximates the data in a least-squares sense. Below, a preferred embodiment is described using Matlab code:
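The moving average alternative mentioned above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name and the window length `width` are hypothetical, and shortening the window at the band edges is one possible choice among several.

```python
def moving_average_envelope(env_db, width=5):
    """Smooth a per-subband spectral envelope (in dB) along the frequency axis.

    env_db: high resolution envelope, one value per QMF subband
    width: hypothetical odd window length; the window is shortened at the
           lowest and highest subbands so that no padding is needed
    """
    half = width // 2
    smoothed = []
    for k in range(len(env_db)):
        lo = max(0, k - half)
        hi = min(len(env_db), k + half + 1)
        window = env_db[lo:hi]
        smoothed.append(sum(window) / len(window))
    return smoothed
```

A wider window gives a coarser envelope, which plays a role comparable to choosing a lower polynomial order in the polynomial fitting approach.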

[00055] In the code above, the input is the spectral envelope (LowEnv) of the low band signal, obtained by averaging the QMF subband samples on a per subband basis over a time frame corresponding to the current time frame of data operated on by the subsequent envelope adjuster. As indicated above, the gain adjustment processing of the low band signal can be performed on various other time grids. In the example above, the estimated absolute spectral envelope is expressed in the logarithmic domain. A low-order polynomial, in the example above a third-order polynomial, is fitted to the data. Given the polynomial, a gain curve (GainVec) is calculated from the difference between the mean energy of the low band signal and the curve (lowBandEnvSlope) obtained from the polynomial fitted to the data. In the example above, the gain curve is determined in the logarithmic domain.
[00056] The gain curve calculation is performed by the gain curve calculation unit 503. As indicated above, the gain curve can be determined from the mean energy of the part of the low band signal used to regenerate the high band signal, and from the spectral envelope of the part of the low band signal used to regenerate the high band signal. In particular, the gain curve can be determined from the difference between the mean energy and the coarse spectral envelope, represented, for example, by a polynomial. That is, the calculated polynomial can be used to determine a gain curve that comprises a separate gain value, also referred to as a spectral gain coefficient, for each relevant QMF subband of the low band signal. This gain curve, comprising the gain values, is subsequently used in the HFR process.
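The procedure described in this paragraph can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the Matlab listing referred to in the text: the helper `polyfit`, the rescaling of the subband indices to [-1, 1], and the division by 20 (which assumes that the envelope is an energy in dB and that the gain is applied to amplitudes) are assumptions made here. The names `low_env_db`, `slope` and the returned gain vector correspond loosely to LowEnv, lowBandEnvSlope and GainVec.

```python
def polyfit(x, y, order):
    """Least-squares polynomial fit; returns c with y ~ c[0] + c[1]*x + ..."""
    n = order + 1
    # Normal equations A c = b, with A[i][j] = sum(x^(i+j)) and b[i] = sum(y*x^i)
    a = [[sum(xi ** (i + j) for xi in x) for j in range(n)] for i in range(n)]
    b = [sum(yi * xi ** i for xi, yi in zip(x, y)) for i in range(n)]
    # Gaussian elimination with partial pivoting
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            f = a[row][col] / a[col][col]
            for j in range(col, n):
                a[row][j] -= f * a[col][j]
            b[row] -= f * b[col]
    # Back substitution
    c = [0.0] * n
    for i in reversed(range(n)):
        s = sum(a[i][j] * c[j] for j in range(i + 1, n))
        c[i] = (b[i] - s) / a[i][i]
    return c

def calculate_gain_vec(low_env_db, order=3):
    """Return one linear gain per low band subband (needs len > order)."""
    n = len(low_env_db)
    # Map subband indices to [-1, 1] to keep the normal equations well conditioned
    x = [2.0 * k / (n - 1) - 1.0 for k in range(n)]
    coeffs = polyfit(x, low_env_db, order)
    slope = [sum(c * xi ** i for i, c in enumerate(coeffs)) for xi in x]
    mean_db = sum(low_env_db) / n
    # Difference in dB converted to a linear amplitude gain (assumed /20)
    return [10.0 ** ((mean_db - s) / 20.0) for s in slope]
```

For an envelope that falls steadily with frequency, the resulting gains attenuate the low subbands and boost the high ones, so that the coarse shape of the gain-adjusted low band is approximately flat at its mean level.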
[00057] As an example, an HFR generation process in accordance with MPEG-4 SBR is described below. The generated HF signal can be derived from the following formula (see the document "MPEG-4 Part 3 (ISO/IEC 14496-3)", subpart 4, section 4.6.18.6.2, which is incorporated into this document by reference):
$$X_{High}(k, l + t_{HFAdj}) = X_{Low}(p, l + t_{HFAdj}) + \mathrm{bwArray}(g(k)) \cdot \alpha_0(p) \cdot X_{Low}(p, l - 1 + t_{HFAdj}) + [\mathrm{bwArray}(g(k))]^2 \cdot \alpha_1(p) \cdot X_{Low}(p, l - 2 + t_{HFAdj})$$
in which p is the subband index of the low band signal, i.e. p identifies one of the plurality of low frequency subband signals. The HF generation formula above can be replaced by the following formula, which performs a combined gain adjustment and HF generation:
$$X_{High}(k, l + t_{HFAdj}) = \mathrm{preGain}(p) \cdot \left( X_{Low}(p, l + t_{HFAdj}) + \mathrm{bwArray}(g(k)) \cdot \alpha_0(p) \cdot X_{Low}(p, l - 1 + t_{HFAdj}) + [\mathrm{bwArray}(g(k))]^2 \cdot \alpha_1(p) \cdot X_{Low}(p, l - 2 + t_{HFAdj}) \right)$$
in which the gain curve is referred to as preGain(p).
[00058] More details about the copy process, e.g. regarding the relationship between p and k, are specified in the above-mentioned MPEG-4 Part 3 document. In the above formula, X_Low(p, l) denotes a sample at time instance l of the low frequency subband signal having subband index p. This sample, in combination with preceding samples, is used to generate a sample of the high frequency subband signal X_High(k, l) having subband index k.
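The combined gain adjustment and HF generation can be sketched for a single output sample as follows. This is a hedged illustration, not the standardized procedure: it assumes that preGain(p) scales the whole sum of the current and the two preceding low band samples, the mapping from the target subband k to the source subband p is not modelled, and the chirp factor bwArray(g(k)) is simply passed in as `bw`. All Python names are hypothetical.

```python
def hf_generate_sample(x_low, p, l, pre_gain, bw, alpha0, alpha1, t_hf_adj=0):
    """Compute one high band subband sample from low band subband p.

    x_low[p][n]: complex QMF sample of low frequency subband p at time slot n
    pre_gain[p]: spectral gain coefficient (gain curve value) for subband p
    bw: value of bwArray(g(k)) for the target subband k, supplied by the caller
    alpha0[p], alpha1[p]: prediction coefficients of subband p
    The caller must make sure that l + t_hf_adj >= 2, so that the two
    preceding samples exist.
    """
    n = l + t_hf_adj
    return pre_gain[p] * (
        x_low[p][n]
        + bw * alpha0[p] * x_low[p][n - 1]
        + bw ** 2 * alpha1[p] * x_low[p][n - 2]
    )
```

Setting every entry of `pre_gain` to 1 gives back the HF generation without gain adjustment, which is one way a decoder could bypass the additional processing.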
[00059] It should be noted that the gain adjustment aspect can be used in any filter bank based high frequency reconstruction system. This is illustrated in figure 6, in which the present invention is part of a standalone HFR unit 601 that operates on a narrowband or low band signal 602 and outputs a broadband or high band signal 604. The module 601 can receive additional control data 603 as input, and the control data 603 can specify, among other things, the amount of processing used for the described gain adjustment, as well as, for example, information on the target spectral envelope of the high band signal. However, these parameters are only examples of optional control data 603. In one embodiment, the relevant information can also be derived from the narrowband signal 602 passed to the module 601, or by other means. That is, the control data 603 can be determined within the module 601 based on the information available in the module 601. It should be noted that the standalone HFR unit 601 can receive the plurality of low frequency subband signals and output the plurality of high frequency subband signals, i.e. the analysis/synthesis filter banks or transforms can be placed outside the HFR unit 601.
[00060] As already indicated above, it may be beneficial to signal the activation of the gain adjustment processing in the bit stream from an encoder to a decoder. For certain signal types, e.g. a single sinusoid, the gain adjustment processing may not be relevant, and it may therefore be beneficial to allow the encoder/decoder system to turn off the additional processing in order not to introduce undesired behavior for such corner case signals. For this purpose, the encoder can be configured to analyze the audio signals and to generate control data that turns the gain adjustment processing on and off in the decoder.
[00061] In figure 7, the proposed gain adjustment stage is included in a high frequency reconstruction unit 703 that is part of an audio codec. An example of such an HFR unit 703 is the MPEG-4 Spectral Band Replication tool used as part of the High Efficiency AAC codec or of the MPEG-D USAC (Unified Speech and Audio Codec). In the present embodiment, a bit stream 704 is received at an audio decoder 700. The bit stream 704 is demultiplexed in the demultiplexer 701. The SBR relevant part 708 of the bit stream is fed to the SBR module or HFR unit 703, and the core coder relevant part 707 of the bit stream, e.g. AAC data or USAC core decoder data, is sent to the core coder module 702. In addition, the low band or narrowband signal 706 is passed from the core decoder 702 to the HFR unit 703. The present invention is incorporated as part of the SBR process in the HFR unit 703, e.g. in accordance with the system outlined in figure 2. The HFR unit 703 outputs a broadband or high band signal 705 using the processing described in this document.
[00062] Figure 8 shows an embodiment of the high frequency reconstruction module 703 in more detail. Figure 8 illustrates that the HF (high frequency) signal generation can be derived from different HF generation modules at different instances in time. The HF generation can be based on a QMF based copy transposer 803, or on an FFT based harmonic transposer 804. For both HF signal generation modules, the low band signal is processed 801, 802 as part of the HF generation in order to determine a gain curve that is used in the copy process 803 or the harmonic transposition process 804. The outputs of the two transposers are selectively fed to the envelope adjuster 805. The decision on which transposer signal to use is controlled by the bit stream 704 or 708. It should be noted that, due to the copy nature of the QMF based transposer, the shape of the spectral envelope of the low band signal is maintained more clearly than when using a harmonic transposer. This typically results in more distinct discontinuities of the spectral envelope of the high band signal when using copy transposers. This is illustrated in the upper and lower panels of figure 1a. Consequently, it may be sufficient to incorporate the gain adjustment only for the QMF based copy method performed in module 803. Nevertheless, applying the gain adjustment to the harmonic transposition performed in module 804 can also be beneficial.
[00063] Figure 9 shows a corresponding encoder module. The encoder 901 can be configured to analyze the particular input signal 903 and to determine the amount of gain adjustment processing that is suitable for the particular type of input signal 903. In particular, the encoder 901 can determine the degree of discontinuity in the high frequency subband signals that will be caused by the HFR unit 703 in the decoder. For this purpose, the encoder 901 can comprise an HFR unit 703, or at least the relevant parts of the HFR unit 703. Based on the analysis of the input signal 903, control data 905 can be generated for the corresponding decoder. The information 905 concerning the gain adjustment to be performed in the decoder is combined in the multiplexer 902 with the audio bit stream 906, thereby forming the complete bit stream 904 that is transmitted to the corresponding decoder.
[00064] Figure 10 shows the output spectra of a real-world signal. Figure 10a shows the output of an MPEG USAC decoder decoding a 12 kbps mono bit stream. The section of the real-world signal is a vocal part of an a cappella recording. The abscissa corresponds to the time axis, and the ordinate corresponds to the frequency axis. When comparing the spectrogram of figure 10a with figure 10c, which shows the corresponding spectrogram of the original signal, it becomes evident that holes (see reference numbers 1001, 1002) appear in the spectrum for the fricative parts of the vocal segment. Figure 10b shows the output spectrogram of the MPEG USAC decoder including the present invention. It can be seen from the spectrogram that the holes in the spectrum have disappeared (see reference numbers 1003, 1004, which correspond to reference numbers 1001, 1002).
[00065] The complexity of the proposed gain adjustment algorithm was calculated as weighted MOPS (WMOPS), in which functions such as POW/DIV/TRIG are weighted as 25 operations and all other operations are weighted as one operation. Given these assumptions, the calculated complexity amounts to about 0.1 WMOPS, with insignificant use of RAM/ROM. In other words, the proposed gain adjustment processing requires little memory and processing capacity.
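The weighting rule described above can be written as a short calculation. The sketch below is illustrative only: the function name and parameters are hypothetical, and no operation counts are given in the text, so the caller has to supply counts for the implementation being measured.

```python
def weighted_mops(heavy_ops_per_frame, simple_ops_per_frame,
                  frames_per_second, heavy_weight=25):
    """Return the weighted number of operations per second, in millions.

    heavy_ops_per_frame: count of POW/DIV/TRIG style operations per frame
    simple_ops_per_frame: count of all other operations per frame
    heavy_weight: weight of one heavy operation (25 in the text)
    """
    weighted = heavy_weight * heavy_ops_per_frame + simple_ops_per_frame
    return weighted * frames_per_second / 1e6
```

Because the gain curve is computed once per frame rather than once per sample, the per-frame counts stay small and the total is dominated by the frame rate.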
[00066] In the present document, a method and system for the generation of a high band signal from a low band signal is described. The method and the system are adapted in order to generate a high band signal, with few spectral discontinuities or no spectral discontinuities, thus improving the perceptual performance of high frequency reconstruction methods and systems. The method and system can be easily incorporated into existing audio encoding / decoding systems. In particular, the method and the system can be incorporated without the need to modify the surround adjustment processing of existing audio encoding / decoding systems. This applies in a remarkable way to the limiting and interpolation functionality of the envelope adjustment processing that can perform its intended tasks. Therefore, the method and system described can be used to regenerate high-band signals with little or no spectral discontinuity and a low noise level. In addition, the use of control data has been described, and control data can be used in order to adapt the described method and system parameters (and computational complexity) to the type of audio signal.
[00067] The methods and systems described in this document can be implemented as software, firmware and / or hardware. Certain components can, for example, be implemented as software that runs on a digital signal processor or on a microprocessor. Other components can, for example, be implemented as hardware and / or as application-specific integrated circuits. The signals encountered in the described methods and systems can be stored on media, such as a random access memory or an optical storage medium. They can be transferred over networks, such as radio networks, satellite networks, wireless networks or wired networks, for example, the internet. Typical devices that make use of the methods and systems described in this document are portable electronic devices or other consumer equipment that are used to store and / or produce audio signals. The methods and systems can also be used in computer systems, for example, on internet servers, which store and provide audio signals, for example, music signals, for download.
Claims (9):
[0001]
1. System (601, 703) configured to generate a plurality of high frequency subband audio signals (604) covering a high frequency range from a plurality of low frequency subband audio signals (602), the system (601, 703) characterized by the fact that it comprises: - a means for receiving the plurality of low frequency subband signals (602); - a means for receiving a set of target energies, each target energy covering a different target range (130) within the high frequency range and being indicative of the desired energy of one or more high frequency subband signals that are within the target range (130); - a means for generating the plurality of high frequency subband signals (604) from the plurality of low frequency subband signals (602) and from a plurality of spectral gain coefficients associated with the plurality of low frequency subband signals (602), respectively; - a means for receiving control data (603) indicating whether it is necessary to apply the plurality of spectral gain coefficients to generate the plurality of high frequency subband signals (604); and - a means for adjusting the energy (203) of the plurality of high frequency subband signals (604) using the set of target energies.
[0002]
2. System (601, 703) according to claim 1, characterized by the fact that the means for adjusting the energy (203) comprises a means for limiting the energy adjustment of the high frequency subband signals (604) which are within a limiter range (135), and where the limiter range (135) covers more than one target range (130).
[0003]
3. System (601, 703) according to claim 1 or 2, characterized by the fact that the means for adjusting the energy (203) of the plurality of high frequency subband signals (604) further comprises a means for ensuring that the adjusted high frequency subband signals that fall within a particular target range (130) have the same energy.
[0004]
4. Audio decoder (700) characterized by the fact that it is configured to decode a bit stream (704) representative of a low frequency audio signal (707) and a set of target energies (708) that describes the spectral envelope of a corresponding high frequency audio signal, the audio decoder (700) comprising: - a core decoding and transforming unit (702, 201) configured to determine a plurality of low frequency subband signals associated with the low frequency audio signal (707) from the bit stream (704); - the system as defined in any one of claims 1 to 3, for generating a plurality of high frequency subband signals from the plurality of low frequency subband signals and the set of target energies; and - a merging and inverse transform unit (202) configured to generate an audio signal from the plurality of low frequency subband signals and the plurality of high frequency subband signals.
[0005]
5. Encoder (901) characterized by the fact that it is configured to generate control data (905) from an audio signal (903), the audio encoder (901) comprising: - a first means operable to analyze the spectral shape of the audio signal (903) and to determine a degree of spectral envelope discontinuity introduced when regenerating a high frequency component of the audio signal (903) from a low frequency component of the audio signal (903); and - a second means operable to generate control data (905) in order to control the regeneration of the high frequency component based on the degree of discontinuity; the first means being adapted to determine the degree of spectral envelope discontinuity by determining relationship information, the relationship information being determined by studying the lower frequencies of the low frequency component and the higher frequencies of the low frequency component, wherein a high value of the determined relationship information is indicative of a high degree of spectral envelope discontinuity and a low value of the determined relationship information is indicative of a low degree of spectral envelope discontinuity.
[0006]
6. Method for generating a plurality of high frequency subband audio signals (604) covering a high frequency range from a plurality of low frequency subband audio signals (602), the method characterized by the fact that it comprises the steps of: - receiving the plurality of low frequency subband signals (602); - receiving a set of target energies, each target energy covering a different target range (130) within the high frequency range and being indicative of the desired energy of one or more high frequency subband signals (604) that are within the target range (130); - generating the plurality of high frequency subband signals (604) from the plurality of low frequency subband signals (602) and from a plurality of spectral gain coefficients associated with the plurality of low frequency subband signals (602), respectively; - receiving control data (603) indicating whether it is necessary to apply the plurality of spectral gain coefficients to generate the plurality of high frequency subband signals (604); and - adjusting the energy of the plurality of high frequency subband signals (604) using the set of target energies.
[0007]
7. Method for decoding a bit stream (704) representative of a low frequency audio signal (707) and a set of target energies (708) that describes the spectral envelope of a corresponding high frequency audio signal, the method characterized by the fact that it comprises the steps of: - determining a plurality of low frequency subband signals (706) associated with the low frequency audio signal (707) from the bit stream (704); - generating a plurality of high frequency subband signals from the plurality of low frequency subband signals and the set of target energies, according to the method as defined in claim 6; and - generating an audio signal from the plurality of low frequency subband signals and the plurality of high frequency subband signals.
[0008]
8. Method for generating control data (905) from an audio signal (903), the method characterized by the fact that it comprises the steps of: - analyzing the spectral shape of the audio signal (903) in order to determine a degree of spectral envelope discontinuity introduced when regenerating a high frequency component of the audio signal (903) from a low frequency component of the audio signal (903); and - generating control data (905) in order to control the regeneration of the high frequency component based on the degree of discontinuity; wherein determining the degree of spectral envelope discontinuity includes determining relationship information by studying the lower frequencies of the low frequency component and the higher frequencies of the low frequency component, where a high value of the determined relationship information is indicative of a high degree of spectral envelope discontinuity and a low value of the determined relationship information is indicative of a low degree of spectral envelope discontinuity.
[0009]
9. Storage medium characterized by the fact that it is adapted for execution on a processor and for the execution of the steps of the method as defined in any of claims 6 to 8, when performed on a computing device.
Similar technologies:
Publication number | Publication date | Patent title
BR112012024360B1|2020-11-03|system configured to generate a plurality of high frequency subband audio signals, audio decoder, encoder, method for generating a plurality of high frequency subband signals, method for decoding a bit stream, method for generating control data from an audio signal and storage medium
Patent family:
Publication number | Publication date
KR20120123720A|2012-11-09|
BR112012024360A2|2016-05-24|
JP2020170186A|2020-10-15|
JP6523234B2|2019-05-29|
DK2765572T3|2017-11-06|
KR20200035175A|2020-04-01|
AU2018214048A1|2018-08-23|
PL3544007T3|2020-11-02|
AU2016202767A1|2016-05-19|
KR101907017B1|2018-12-05|
EP2596497B1|2014-05-28|
KR102304093B1|2021-09-23|
JP2015111277A|2015-06-18|
SG10202107800UA|2021-09-29|
ES2484795T3|2014-08-12|
PL3288032T3|2019-08-30|
JP2017062483A|2017-03-30|
US20190221220A1|2019-07-18|
KR20170130627A|2017-11-28|
KR101964180B1|2019-04-01|
PL3544008T3|2020-08-24|
HK1249653B|2020-01-03|
PL3291230T3|2019-08-30|
ES2727460T3|2019-10-16|
NO2765572T3|2018-01-27|
ES2644974T3|2017-12-01|
CN104575517A|2015-04-29|
EP3285258A1|2018-02-21|
MY154277A|2015-05-29|
MX2012010854A|2012-10-15|
AU2011281735B2|2014-07-24|
EP3288032B1|2019-04-17|
US20210366494A1|2021-11-25|
RU2018120544A3|2021-08-17|
SG183501A1|2012-09-27|
RU2659487C2|2018-07-02|
KR20170020555A|2017-02-22|
HK1199973A1|2015-07-24|
JP6035356B2|2016-11-30|
AU2014203424B2|2016-02-11|
AU2021277643A1|2021-12-23|
KR102159194B1|2020-09-23|
ES2801324T3|2021-01-11|
US9911431B2|2018-03-06|
RU2758466C2|2021-10-28|
EP3544007B1|2020-06-17|
EP3723089A1|2020-10-14|
CA2920930A1|2012-01-26|
RU2012141098A|2014-05-10|
EP2596497A1|2013-05-29|
AU2020233759B2|2021-09-16|
CN103155033A|2013-06-12|
KR20200110478A|2020-09-23|
EP3544008A1|2019-09-25|
JP2019144584A|2019-08-29|
CA3072785A1|2012-01-26|
EP3285258B1|2018-12-19|
CA2792011C|2016-04-26|
PL2596497T3|2014-10-31|
KR102026677B1|2019-09-30|
CA3027803A1|2012-01-26|
US20150317986A1|2015-11-05|
ES2807248T3|2021-02-22|
EP3544008B1|2020-05-20|
AU2014203424A1|2014-07-10|
WO2012010494A1|2012-01-26|
CA2792011A1|2012-01-26|
RU2014127177A|2016-02-10|
EP3291230A1|2018-03-07|
KR102095385B1|2020-03-31|
PL3285258T3|2019-05-31|
EP2765572A1|2014-08-13|
US20120328124A1|2012-12-27|
RU2018120544A|2019-12-04|
PL3544009T3|2020-10-19|
KR101478506B1|2015-01-06|
KR20130127552A|2013-11-22|
EP2765572B1|2017-08-30|
EP3291230B1|2019-04-17|
AU2011281735A1|2012-09-13|
KR20180108871A|2018-10-04|
CA3087957A1|2012-01-26|
PL2765572T3|2018-01-31|
SG10201505469SA|2015-08-28|
KR101709095B1|2017-03-08|
EP3544009A1|2019-09-25|
KR101803849B1|2017-12-04|
CA2920930C|2019-01-29|
AU2020233759A1|2020-10-08|
KR20190112824A|2019-10-07|
US9117459B2|2015-08-25|
EP3288032A1|2018-02-28|
US20180144753A1|2018-05-24|
ES2798144T3|2020-12-09|
EP3544007A1|2019-09-25|
JP5753893B2|2015-07-22|
US20170178665A1|2017-06-22|
EP3544009B1|2020-05-27|
CN104575517B|2018-06-01|
US10283122B2|2019-05-07|
DK2596497T3|2014-07-21|
MY177748A|2020-09-23|
JP6993523B2|2022-01-13|
CA3027803C|2020-04-07|
US11031019B2|2021-06-08|
JP2021092811A|2021-06-17|
CN103155033B|2014-10-22|
CA3146617A1|2012-01-26|
EP3723089B1|2022-01-19|
JP6845962B2|2021-03-24|
KR20190034361A|2019-04-01|
RU2530254C2|2014-10-10|
ES2727300T3|2019-10-15|
JP2022031889A|2022-02-22|
JP6727374B2|2020-07-22|
CL2012002699A1|2012-12-14|
AU2018214048B2|2020-07-30|
HK1249798B|2020-04-24|
JP2013531265A|2013-08-01|
AU2016202767B2|2018-05-17|
CA3072785C|2020-09-01|
ES2712304T3|2019-05-10|
US9640184B2|2017-05-02|
KR20210118205A|2021-09-29|
Cited documents:
Publication number | Filing date | Publication date | Applicant | Patent title

EP0208712B1|1984-12-20|1993-04-07|Gte Laboratories Incorporated|Adaptive method and apparatus for coding speech|
DE3912605B4|1989-04-17|2008-09-04|Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V.|Digital coding method|
SE512719C2|1997-06-10|2000-05-02|Lars Gustaf Liljeryd|A method and apparatus for reducing data flow based on harmonic bandwidth expansion|
US6385573B1|1998-08-24|2002-05-07|Conexant Systems, Inc.|Adaptive tilt compensation for synthesized speech residual|
SE9903553D0|1999-01-27|1999-10-01|Lars Liljeryd|Enhancing conceptual performance of SBR and related coding methods by adaptive noise addition and noise substitution limiting |
SE0004163D0|2000-11-14|2000-11-14|Coding Technologies Sweden Ab|Enhancing perceptual performance or high frequency reconstruction coding methods by adaptive filtering|
SE0004187D0|2000-11-15|2000-11-15|Coding Technologies Sweden Ab|Enhancing the performance of coding systems that use high frequency reconstruction methods|
SE0004818D0|2000-12-22|2000-12-22|Coding Technologies Sweden Ab|Enhancing source coding systems by adaptive transposition|
JP3870193B2|2001-11-29|2007-01-17|コーディングテクノロジーズアクチボラゲット|Encoder, decoder, method and computer program used for high frequency reconstruction|
US20030187663A1|2002-03-28|2003-10-02|Truman Michael Mead|Broadband frequency translation for high frequency regeneration|
JP2004010415A|2002-06-06|2004-01-15|Kawasaki Refract Co Ltd|Magnesite-chrome spraying repairing material|
DE60327039D1|2002-07-19|2009-05-20|Nec Corp|AUDIO DEODICATION DEVICE, DECODING METHOD AND PROGRAM|
CN100492492C|2002-09-19|2009-05-27|松下电器产业株式会社|Audio decoding apparatus and method|
RU2353980C2|2002-11-29|2009-04-27|Конинклейке Филипс Электроникс Н.В.|Audiocoding|
KR100524065B1|2002-12-23|2005-10-26|삼성전자주식회사|Advanced method for encoding and/or decoding digital audio using time-frequency correlation and apparatus thereof|
JP2005040749A|2003-07-25|2005-02-17|Toyo Ink Mfg Co Ltd|Method for curing ultraviolet curing paint composition|
CN101556800B|2003-10-23|2012-05-23|松下电器产业株式会社|Acoustic spectrum coding method and apparatus, spectrum decoding method and apparatus, acoustic signal transmission apparatus, acoustic signal reception apparatus|
DE602004010188T2|2004-03-12|2008-09-11|Nokia Corp.|SYNTHESIS OF A MONO AUDIO SIGNAL FROM A MULTI CHANNEL AUDIO SIGNAL|
US8396717B2|2005-09-30|2013-03-12|Panasonic Corporation|Speech encoding apparatus and speech encoding method|
US20080071550A1|2006-09-18|2008-03-20|Samsung Electronics Co., Ltd.|Method and apparatus to encode and decode audio signal by using bandwidth extension technique|
US8295507B2|2006-11-09|2012-10-23|Sony Corporation|Frequency band extending apparatus, frequency band extending method, player apparatus, playing method, program and recording medium|
US8189812B2|2007-03-01|2012-05-29|Microsoft Corporation|Bass boost filtering techniques|
KR101355376B1|2007-04-30|2014-01-23|삼성전자주식회사|Method and apparatus for encoding and decoding high frequency band|
CA2697920C|2007-08-27|2018-01-02|Telefonaktiebolaget L M Ericsson |Transient detector and method for supporting encoding of an audio signal|
EP2045801B1|2007-10-01|2010-08-11|Harman Becker Automotive Systems GmbH|Efficient audio signal processing in the sub-band regime, method, system and associated computer program|
US8504377B2|2007-11-21|2013-08-06|Lg Electronics Inc.|Method and an apparatus for processing a signal using length-adjusted window|
CN101458930B|2007-12-12|2011-09-14|华为技术有限公司|Excitation signal generation in bandwidth spreading and signal reconstruction method and apparatus|
AT518224T|2008-01-04|2011-08-15|Dolby Int Ab|AUDIO CODERS AND DECODERS|
KR101413968B1|2008-01-29|2014-07-01|삼성전자주식회사|Method and apparatus for encoding audio signal, and method and apparatus for decoding audio signal|
KR101239812B1|2008-07-11|2013-03-06|프라운호퍼 게젤샤프트 쭈르 푀르데룽 데어 안겐반텐 포르슝 에. 베.|Apparatus and method for generating a bandwidth extended signal|
JP5419876B2|2008-08-08|2014-02-19|パナソニック株式会社|Spectrum smoothing device, coding device, decoding device, communication terminal device, base station device, and spectrum smoothing method|
JP2010079275A|2008-08-29|2010-04-08|Sony Corp|Device and method for expanding frequency band, device and method for encoding, device and method for decoding, and program|
SG172976A1|2009-01-16|2011-08-29|Dolby Int Ab|Cross product enhanced harmonic transposition|
DK2211339T3|2009-01-23|2017-08-28|Oticon As|listening System|
JP4945586B2|2009-02-02|2012-06-06|株式会社東芝|Signal band expander|
CN101521014B|2009-04-08|2011-09-14|武汉大学|Audio bandwidth expansion coding and decoding devices|
EP2239732A1|2009-04-09|2010-10-13|Fraunhofer-Gesellschaft zur Förderung der Angewandten Forschung e.V.|Apparatus and method for generating a synthesis audio signal and for encoding an audio signal|
TWI556227B|2009-05-27|2016-11-01|杜比國際公司|Systems and methods for generating a high frequency component of a signal from a low frequency component of the signal, a set-top box, a computer program product and storage medium thereof|
BR122020007866B1|2009-10-21|2021-06-01|Dolby International Ab|SYSTEM CONFIGURED TO GENERATE A HIGH FREQUENCY COMPONENT OF AN AUDIO SIGNAL, METHOD FOR GENERATING A HIGH FREQUENCY COMPONENT OF AN AUDIO SIGNAL AND METHOD FOR DESIGNING A HARMONIC TRANSPOSITOR|
US9159337B2|2009-10-21|2015-10-13|Dolby International Ab|Apparatus and method for generating a high frequency audio signal using adaptive oversampling|
US9047875B2|2010-07-19|2015-06-02|Futurewei Technologies, Inc.|Spectrum flatness control for bandwidth extension|
PL3544008T3|2010-07-19|2020-08-24|Dolby International Ab|Processing of audio signals during high frequency reconstruction|
US8971551B2|2009-09-18|2015-03-03|Dolby International Ab|Virtual bass synthesis using harmonic transposition|
WO2014060204A1|2012-10-15|2014-04-24|Dolby International Ab|System and method for reducing latency in transposer-based virtual bass systems|
JP5754899B2|2009-10-07|2015-07-29|ソニー株式会社|Decoding apparatus and method, and program|
JP5609737B2|2010-04-13|2014-10-22|ソニー株式会社|Signal processing apparatus and method, encoding apparatus and method, decoding apparatus and method, and program|
JP5850216B2|2010-04-13|2016-02-03|ソニー株式会社|Signal processing apparatus and method, encoding apparatus and method, decoding apparatus and method, and program|
PL3544008T3|2010-07-19|2020-08-24|Dolby International Ab|Processing of audio signals during high frequency reconstruction|
JP6075743B2|2010-08-03|2017-02-08|ソニー株式会社|Signal processing apparatus and method, and program|
JP5707842B2|2010-10-15|2015-04-30|ソニー株式会社|Encoding apparatus and method, decoding apparatus and method, and program|
CN104321815B|2012-03-21|2018-10-16|三星电子株式会社|High-frequency coding/high frequency decoding method and apparatus for bandwidth expansion|
US9173041B2|2012-05-31|2015-10-27|Purdue Research Foundation|Enhancing perception of frequency-lowered speech|
JP6305694B2|2013-05-31|2018-04-04|クラリオン株式会社|Signal processing apparatus and signal processing method|
ES2635026T3|2013-06-10|2017-10-02|Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V.|Apparatus and procedure for encoding, processing and decoding of audio signal envelope by dividing the envelope of the audio signal using quantization and distribution coding|
KR101789083B1|2013-06-10|2017-10-23|프라운호퍼 게젤샤프트 쭈르 푀르데룽 데어 안겐반텐 포르슝에.베.|Apparatus and method for audio signal envelope encoding, processing and decoding by modelling a cumulative sum representation employing distribution quantization and coding|
CN111477245A|2013-06-11|2020-07-31|弗朗霍弗应用研究促进协会|Speech signal decoding device and speech signal encoding device|
AU2014283285B2|2013-06-21|2017-09-21|Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V.|Audio decoder having a bandwidth extension module with an energy adjusting module|
TWI557726B|2013-08-29|2016-11-11|杜比國際公司|System and method for determining a master scale factor band table for a highband signal of an audio signal|
US9666202B2|2013-09-10|2017-05-30|Huawei Technologies Co., Ltd.|Adaptive bandwidth extension and apparatus for the same|
EP3048609A4|2013-09-19|2017-05-03|Sony Corporation|Encoding device and method, decoding device and method, and program|
US10163447B2|2013-12-16|2018-12-25|Qualcomm Incorporated|High-band signal modeling|
AU2014371411A1|2013-12-27|2016-06-23|Sony Corporation|Decoding device, method, and program|
US20150194157A1|2014-01-06|2015-07-09|Nvidia Corporation|System, method, and computer program product for artifact reduction in high-frequency regeneration audio signals|
CN106409303B|2014-04-29|2019-09-20|华为技术有限公司|Handle the method and apparatus of signal|
EP2980794A1|2014-07-28|2016-02-03|Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V.|Audio encoder and decoder using a frequency domain processor and a time domain processor|
AR111047A1|2017-03-23|2019-05-29|Dolby Int Ab|INTEGRATION OF HARMONIC TRANSPOSITOR COMPATIBLE WITH PREVIOUS VERSIONS FOR THE RECONSTRUCTION OF HIGH FREQUENCY OF AUDIO SIGNALS|
EP3659040A4|2017-07-28|2020-12-02|Dolby Laboratories Licensing Corporation|Method and system for providing media content to a client|
TWI702594B|2018-01-26|2020-08-21|瑞典商都比國際公司|Backward-compatible integration of high frequency reconstruction techniques for audio signals|
MA50760A|2018-04-25|2020-06-10|Dolby Int Ab|INTEGRATION OF HIGH FREQUENCY RECONSTRUCTION TECHNIQUES WITH REDUCED POST-PROCESSING DELAY|
Legal status:
2018-12-26| B06F| Objections, documents and/or translations needed after an examination request according [chapter 6.6 patent gazette]|
2019-09-10| B06U| Preliminary requirement: requests with searches performed by other patent offices: procedure suspended [chapter 6.21 patent gazette]|
2020-06-02| B09A| Decision: intention to grant [chapter 9.1 patent gazette]|
2020-11-03| B16A| Patent or certificate of addition of invention granted [chapter 16.1 patent gazette]|Free format text: TERM OF VALIDITY: 20 (TWENTY) YEARS COUNTED FROM 14/07/2011, SUBJECT TO THE LEGAL CONDITIONS. |
Priority:
Application number | Filing date | Patent title
US36551810P| true| 2010-07-19|2010-07-19|
US61/365,518|2010-07-19|
US38672510P| true| 2010-09-27|2010-09-27|
US61/386,725|2010-09-27|
PCT/EP2011/062068|WO2012010494A1|2010-07-19|2011-07-14|Processing of audio signals during high frequency reconstruction|